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1 Introduction 


This report describes the research work undertaken at the Massachusetts Institute of Technology, 
under NASA Research Grant NAG4-157. The aim of this research is to identify effective algorithms 
and methodologies for the efficient and routine solution of flow simidations about complete vehicle 
configurations. 

For over ten years we have received support from NASA to develop unstructured mesh methods 
for Computational Fluid Dynamics. As a result of this effort a methodology based on the use of 
unstructured adapted meshes of tetrahedra and finite volume flow solvers has been developed. A 
number of gridding algorithms, flow solvers, and adaptive strategies have been proposed. The most 
successful algorithms developed from the basis of the unstructured mesh system FELISA. The FE- 
LISA system has been extensively for the analysis of transonic and hypersonic flows about complete 
vehicle configurations. The system is highly automatic and allows for the rotine aerodynamic anal- 
ysis of complex configurations starting from CAD data. The code has been parallelized and utilizes 
efficient solugion algorithms. For hypersonic flows, a version of the code which incorporates real 
gas effects, has been produced. The FELISA system is also a component of the STARS aeroservoe- 
lastic system developed at NASA Dryden. One of the latest developments before the start of this 
grant was to extend the system to include viscous effects. This required the development of viscous 
generators, capable of generating the anisotropic grids required to represent boundary layers, and 
viscous flow solvers. In figures 1 and 2, we show some sample hypersonic viscous computations 
using the developed viscous generators and solvers. 

Although this initial results were encouraging it became apparent that in order to develop a 
fully functional capability for viscous flows, several advances in solution accuracy, robustness and 
efficiency were required. In this grant we set out to investigate some novel methodologies that could 
lead to the required improvements. In particular we focused on two fronts: finite element methods 
[1], [26] and iterative algebraic multigrid solution techiques [30]. 

Finite element algorithms have been enormously successful in the field of structural mechanics 
and in that field, they are the method of choice. They have a solid mathematical foundation 
and offer complete geometrical flexibility. Finite element methods utilize compact supports, which 
substantially aid the implict solution procedure, and, when properly formulated, can be extended 
to arbitrary orders of accuracy. 

Multigrid techniques are well known for their efficiency in solving symmetric positive definite 
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problems. Unfortunately, the discretization of the Navier-Stokes equations generates systems of 
equations which are unsymmetric and much harder to solve. 

The work reported below has been partially supported by this grant. 

2 Finite Element Solution of the Compressible Flow equations 

Our aim is to develop a robust and accurate finite element algorithm which is capable of solving flows 
ranging from low subsonic to transonic and hypersonic regimes. We regard the possibility of using 
higher order approximations as being highly attractive, specially for complex three dimensional 
flows which requiring high levels of resolution. 

Here we present some of the accomplishments carried out during this grant period and which 
build upon existing finite element methods. 

2.1 The solution of the Compressible Euler Equations at Low Mach Numbers 

For low Mach numbers, the compressible Euler and Navier Stokes equations describe almost incom- 
pressible flow. This singular limit of the compressible flow equations is reasonably well understood. 
Indeed, under the assumption of isentropic flow, and some regularity conditions on the initial and 
boundary data, the solution of the compressible equations, in the zero Mach number limit, can 
be shown to satisfy the incompressible flow equations [16, 11]. From the computational point of 
view, accurate solutions of nearly incompressible flows are difficult to obtain. This is due to the 
very different magnitude of the wave speeds which are present in the system. Our interest is in 
the development of a compressible flow algorithm which is capable of solving flows ranging from 
almost incompressible to supersonic regimes. There are several reasons that justify the development 
of such algorithm. The first and most important is that in many situations, such as high angle 
of attack aerodynamics, large regions of very low Mach number coexist in the flow domain with 
regions where the flow is supersonic. Another more practical motivation is that an algorithm that 
can successfully handle free stream Mach numbers as low as 10“ 3 is well suited for a broad range 
of applications typically handled with incompressible formulations. 

Over the last few years, stabilized finite element algorithms for the solution of the Euler and 
Navier-Stokes equations have gained increased popularity. Streamlme-Upwind/Petrov-Galerkm 
algorithms (SUPG), and some of its relatives, have been presented and analyzed in numerous papers 
[14, 15, 20]. These algorithms, although not so widely used as their finite volume counterparts, 
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possess many attractive features. In particular, they have a compact support and can be used 
with elements of arbitrary order, yielding solutions of increased accuracy whenever the solution is 
sufficiently smooth. In addition, there exists a rather well developed theory for linear problems 
which can be used to provide design criteria for the development of successful algorithms for non- 
linear equations. 

One of the critical ingredients of SUPG algorithms is the construction of the stabilization matrix 
r. Whilst the convergence analysis for the linear problem only dictates that this matrix has to 
be symmetric positive definite, scale appropriately with the local grid size, and have dimensions of 
time, much freedom is still available to fully determine it. Only under some very simplified cases 
(e.g. one dimensional flow) is the optimal choice of r unambiguous. 

Classical compressible flow formulations, including the existing SUPG algorithms, fail to give 
adequate numerical solutions when the flow approaches incompressibility. Whenever the Mach 
number is progressively reduced, keeping the grid size fixed, a degradation in the solution accuracy is 
observed. This phenomenon has been studied extensively in the context of finite volume algorithms 
[23, 18, 6], and several explanations have been proposed. The most common argument is based 
on the mismatch which occurs, for low Mach numbers, between magnitude of the fluxes in the 
original equations and the corresponding terms in the numerically added artificial viscosity. Some 
local preconditioning strategies, aimed at modifying the numerical artificial viscosity to avoid this 
mismatch, have proven to be extremely successful [3, 24, 4, 25]. Accurate solutions for Mach 
numbers as low as 10“ 3 have been reported using such preconditioned algorithms. One of the 
main drawbacks of these preconditioners is their lack of robustness near stagnation points. The 
reason for this can be traced [5] to a lack of stabilization caused by the eigenvectors of the artificial 
dissipation matrix becoming nearly parallel. 

The formulation of numerical algorithms for the compressible flow equations employing sym- 
metrizing, or entropy, variables has been advocated by several authors e.g. [13, 2]. One of the claims 
often made is that discrete schemes formulated in entropy variables inherit global entropy stability 
properties of the original equations. Gustafsson [10] points out that, for a hyperbolic system of 
equations, the higher the degree of unsymmetry, as measured by the condition number of the trans- 
formation required to symmetrize the system, the lesser the well-posedness of the problem, in the 
sense that perturbations of the initial data influence the solution more. As it turns out, the com- 
pressible Euler equations formulated in terms of either primitive or conservative variables become 
increasingly unsymmetric when the reference Mach number goes to zero. From our perspective, 
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one of the main attractive features derived from the use of entropy variables is that the dissipation 
operator remains symmetric. This means that a full set of orthogonal eigenvectors can always be 
found and therefore the stabilization terms remain effective throughout the computational doamin. 
In this respect, the combination of the preconditioning ideas presented in the finite volume context 
to deal with low Mach number flows combined with a formulation in entropy variables seems very 
appealing. 

Here, we propose a specific construction of the stabilization matrix r for a finite element SUPG 
formulation of the steady state Euler equations. The development of such stabilization matrix 
incorporates low Mach number preconditioning concepts, previously developed in the finite volume 
context. The ideas presented here extend to the time dependent and the Navier-Stokes equations 
in a straightforward manner. The equations are first transformed into a new set of variables which, 
in the low Mach number limit, can be easily related to the incompressible velocity and pressure. 
A necessary condition for stability, when the Mach number tends to zero, is obtained by requiring 
that the asymptotic behavior of the stabilization terms matches that of the terms in the Euler 
equations. This requirement results in some additional constraints on r which are not typically 
satisfied by the classical choices of r. The proposed algorithm combines the very attractive features 
of the stabilized finite element formulations with the ability to produce accurate answers over a 
very large range of Mach numbers. 

We start with a brief description of the SUPG formulation for the Euler equations using entropy 
variables. For simplicity of presentation we will consider the two dimensional case, but results 
herein presented extend directly to three dimensions and are also applicable to the Navier-Stokes 
equations. Next, we discuss the low Mach number scaling arguments which lead to the proposed 
form of stabilization matrix t and outline our numerical implementation. Finally, numerical tesults 
using linear elements are presented for the flow over isolated cylinders and airfoils with free stream 
Mach numbers ranging from 10 — ^ to moderate subsonic values. In the next section, we present 
results for transonic flows using higher order elements. 
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2.2 Compressible flow governing equations 
2.2.1 Conservation variables 

We start from the time dependent two dimensional compressible Euler equations in conservation 
form 

U, * + F u + F 2 , 2 = 0, (1) 


where 



p 


pu 1 


PU 2 

u = 

pu 1 

, F, = 

pu\ +p 

, f 2 = 

pu \U2 


PU 2 


pu \U 2 


put, + p 


. P E . 


_ u\(pE +p) _ 


_ u 2 (pE + p) _ 


In the above expressions, p is the density; u = [u\,U 2 ] T is the velocity vector; E is the specific 
total energy; p is the pressure; and the comma denotes partial differentiation (e.g. Uj d\J /dt, 
the partial derivative with respect to time, F ld = dF l /dx j , the partial derivative with respect to 
the j-th spatial coordinate). The system of equations is closed once the pressure is related to the 
problem variables through the equation of state, p — y — 1 )pc, where e = E— |u| /2, is the internal 
energy. Here, is the ratio of specific heats which is assumed to be constant. 

We will assume that all the above quantities have been non-dimensionalized using reference, 
or free stream, values for density p* , velocity u*, and length L. Thus, the dimensional variables, 
denoted with an overbar, are related to the non-dimensional variables introduced above as 
E 


Uj 


P = 


Ui = — , i = 1,2, 
u* 


p = 


P 


_ x i ■ , c. , , t 

E=— , Xi = —,i = 1,2, and t = — . 


p* U* P*u% u% 

Finally, we introduce the reference speed of sound c*, and define, for later use, a reference Mach 
number as e = u»/c* . 

We note that the equation system (1) can be written as 


U, t + AiUj + A 2 U, 2 — 0, 


( 2 ) 


where the Jacobian matrices A, = F 2)U) * = 1,2, are unsymmetric but have real eigenvalues and a 
complete set of eigenvectors. 
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2.2.2 Entropy variables 

We seek a new set of variables V, called entropy variables, such that the change U = U(V) applied 
to ( 1 ) give the transformed system 

U(V) |t + F 1 (V),i+F 2 (V ), 2 = 0, (3) 

where A 0 = U,v is symmetric positive definite, and A t = A,A 0 = F^v, * = 1 , 2 , are symmetric. 

Following [12], we introduce a scalar entropy function H(V) — —pg{s), where s is the non- 
dimensional entropy s = In (p/p^). The required change of variables is obtained by taking 

e(7 - a/d') ~ M 2 / 2 

u 1 

u 2 
-1 

The conditions Q r > 0 and 9^ 1 9^ ^7 ^ , ensure that //(X J) is a convex function and therefore 
A~* = V u = H uU) and Ao, are symmetric positive definite. We consider in [ 1 ] the variables 
resulting from two particular choices of the function g(s). The system (3), can thus be written in 
symmetric quasi- linear form as 

A 0 V, t + AiV,x + A 2 V , 2 = 0 . (5) 

Barth [2] noted that it is always possible to construct matrices R n i = 1,2 whose columns are 
the scaled right eigenvectors of A,, i = 1,2 respectively, such that 

A 0 = RiRf = R 2 Rl , ( 6 ) 

and 

Aj = Rj AjR," 1 , Aj = RjAiR-f , fori = 1 , 2 . (7) 

An explicit expression for R,, i = 1,2, is given in [1]. 
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2.3 Low Mach number limit 


The low Mach number limit of the Euler equations is best studied by rewriting the equations in 
terms of the variables (p,u\,U 2 ,s). This yields the following system 


l_ dj)_ 


pc dt 


du\ 


dt 

+ 

dU2 


dt 


ds 


. dt . 

L 


u\ c 0 0 

c u\ 0 0 

0 0 ui 0 

0 0 0 ui 

In matrix form, this system can be written as 


1 dp 
pc dx\ 

du\ 

dx\ 


Du? 

<9xi 

Os 

<9xi j 


+ 


U-2 0 c 0 

0 u 2 0 0 

C 0 U2 0 

0 0 0 u 2 



J. J 2. 

pc dx 2 


0 


du\ 
dx 2 


0 


du2 
dx 2 


0 


i 

f|s> 


0 


( 8 ) 


(9) 


Z, t + CiZ,i + C 2 Z, 2 = 0, 

where the matrices Ci and C 2 are symmetric and dZ = [d(ft,dui,du 2 ,ds] T with d(f> = If we 
assume that the flow is isentropic, which is certainly satisfied for low Mach numbers when the free 
stream is isentropic, we can express c and p as a function of p, i.e, 


P 1 

P = — 2 

7e- : 


and 


,(7-l)/2 


c = 


to obtain [9], 


<t> = 


2 

(7 - !)e 


(pb- 1 )/ 2 


1) 




€ 


In terms of (f>, u\ and u 2 , the continuity and momentum isentropic compressible flow equations 


become 


0,t + ^10,1 + 


— ( f) + ~ ) «1,1 + U 2 <f>,2 + 


+ “ ) U 2,2 = 


«l,t + 


— r — 0 + - ) 0,1 + + ^2^1,2 = 0, 


(10) 


U 2j t + ^ 2 - 0 + - J 0,2 + ^2^2,2 - 0. 

Note that, in assuming isentropic flow, we have reduced the number of dependent variables by 
eliminating entropy from the system of governing equations. Equations (10), look now similar to 
the incompressible flow equations 


& 1,1 + U 2,2 — 0 


v>i } t +p y 1 + ni^i,i + ^2^1,2 — 0 (H) 

U 2 ,t + ^ 1 ^ 2,1 + P, 2 + ^ 2 ^ 2,2 = 0 , 
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where u = [ui,U 2 ] t , denotes the velocity and p the pressure. When e — > 0, the equivalence 
between (10) and (11), under some regularity assumptions on the initial and boundary data, can be 
rigorously shown [11]. In particular, we have that for e — > 0, u t — > Ui and ((7 — l)4>/2 + 1/e) 4> t i 
■pi for i = 1,2. To obtain bounded derivatives when e — * 0, it follows that (j\i and «i ( i + W 2/2 must 
be 0(e). We also note that the velocity components u l and its spatial derivatives u h] ,i,j = 1,2 are 
0(1). Finally, it can be shown that <f> is O(e) and that </>/e is well defined in the limit e -> 0. 

We now turn our attention to the steady state Euler equations for isentropic flow and derive 
some asymptotic estimates which are valid in the low Mach number limit. The equations for this 
reduced problem are, 


C{zfj + C£zf 2 = 0 (12) 


z 1 = 

1 

1 

, c{ = 

1 

£ Ci 

0 0 

1 

11 

O 

U2 0 C 
0 U 2 0 


. U2 . 


1 

0 

0 

1 


C 0 U 2 


Using the above estimates we have that, for e — > 0, 



' 0(e) ' 


0(1) 

0(e -1 ) 

0 


0(1) 

0 

0(e~ l ) ' 

z\ ~ rs ' 

0(1) 


o(c-M 

0(1) 

0 


0 

0(1) 

0 


. 0(i) . 


0 

0 

0(1) _ 


. 0(e~ l ) 

0 

0(1) 


(13) 


Thus, for the individual terms of the isentropic Euler equations, 


c{zf x ~ c! 2 z' 2 ~ 


0 (£ - 1 ) 
0 ( 1 ) 
0 ( 1 ) 


as e —> 0. 


2.4 Variational formulation for the steady state problem 


(14) 


We now consider the compressible steady problem in conservation form expressed in terms of 
symmetrizing variables. The conservative form of the equations is taken to be the starting point 
because we are ultimately interested in an algorithm that can be used over the whole range of 
speed regimes, including situations were the solution may contain discontinuitites. The problem is 
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defined in a domain 0 with boundary V by 


(15) 

(16) 
(17) 


Fi(V),i +F 2 (V), 2 =0 in ft, 

A-v = A-g on r\r a , 
u ■ n = 0 on T a . 

For simplicity, the domain boundary is assumed to be made up of an impermeable solid wall r a , 
and a computational far field boundary r\r a . In (16, 17), n = [n\,ri 2 ] T is fhe outward unit normal 
vector to T, and A n = A n Ao, A n = A\n\ + A 2 U. 2 . Finally, A„ — A n Ao, and A n denotes the 
negative definite part of A n . 

Let the spatial domain be discretized into non-overlapping elements T e , such that ft = \JT e , 
and T e f| T e > = 0, e ^ e'. We consider the space of functions V/*, defined over the discretization and 
consisting of the continuous functions which are piecewise linear over each element 

V* = {W | W G (C°(ft))\ W|r e 6 (Vi (T e ))\ VT e G ft}. 

The SUPG algorithm can then be written as: Find V h G V h such that for all W G V\ 

B(V h , W) gal + B{V hl W) supg + B(V h ,W) bc = 0, (18) 

where the forms B(-,-) ga h B(‘^) sup g and B(',-) bc account for the Galerkin, SUPG stabilization, 
and boundary condition terms respectively, and are defined as 


B(V, W) gal = f (-W.r-F^V) 

JQ 

- W i 2 -F 2 (V))dft, 

(19) 

B(V, W) supg = f ( Ai W iX + A 2 W, 2 ) 
Jn 

■ r (AiV,! + A 2 V, 2 ) rfft, 

(20) 

B(V, W) 6c = [ W-F^Vjn )ds + 
Jr a J 

[ W • Fy/(V , g; n) ds. 
'r\r„ 

(21) 


where r is the stabilization matrix. The numerical flux function on the impermeable wall boundary 
F w , is simply [ 0 ,prci,pri 2 , 0] T while the numerical flux function on the far field boundary F //, is 
defined by 

F ff (V— ? V+ ! n) = 1(F„(V_) + F„(V+)) - l|A n (V Roe (V_, V + ))|(U(V+) - U(V_)). 

Here, |A„(V)| = A+(V) - A“(V) is the absolute value of A n evaluated at V, and V Roe (V + , V_), 
is the Roe average [19], between the states V + and V - . 
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Figure 3: Mapping between the master triangle T e and a typical element T e . 

2.4.1 Standard definitions for r 

In order to uniquely specify the SUPG algorithm (18), an appropriate expression for the stabilization 
matrix r needs to be given. We recall that an appropriate r should be symmetric, positive definite, 
have dimensions of time, and scale linearly with the element size [14]. 

For a triangle T e , we introduce the mapping x = x(£), between the master triangle T e , in the 
parametric space £ = [£i,£ 2 ]^\ and T e 6 in the mapped space x = [xj,x 2 ] ) as illustrated in 
figure 3. We define 

B i = ( x 2^ 2 Ai -zi,£ 2 A 2 )/J, 

B '2 = A 2 — # 2 ,£i Ai)/J, 

B 3 = — a; 2 ,^ 2 )Al + (xi ,{ 2 — 2 : 1 ,Ci)A 2 )/J, 

and the Jacobian of the mapping, J = x\ £^2 — x 2£i Xl ,b‘ ^'^ ie f°H° w i n g definition for t can be 
shown to satisfy all of the above requirements 

r = Ao" 1 (|Bi| p + |B 2 | p + |B 3 | p )^ 1/p - (22) 

The most common choice for p has been 2, which necessitates the evaluation of a matrix square 
root. This choice has been advocated by Shakib et al. [20], whereas Barth [2], has shown that the 
choice p- 1 is computationally advantageous and, in practice, gives very similar results. 

These choices of stabilization matrix, work well provided that the flow Mach numberis not too 
low. For problems involving significant regions of low Mach number flow the solution degrades 
and becomes less and less accurate as the Mach number is decreased. In the following sections we 
will address this issue by first, identifying the source of the problem, and second, devising a new 
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formulation for r which does not suffer from this drawback and which leads to a formulation which 
can handle very low Mach numbers accurately. 

2.5 A stabilized formulation for isentropic flows 

Turkel et al. [23] have employed a scaling analysis to determine the appropriate low Mach number 
behavior of local preconditioners. In their application, the local preconditioner modifies the upwind 
dissipation of the underlying finite volume or finite difference discretization. In this section, we will 
consider a stabilized SUPG formulation applied direclty to the isentropic equations (12). We will 
extend the analysis presented in [23] to our finite element formulation and, in particular, derive the 
appropriate scaling for the stabilization matrix in the limit of vanishing Mach number. 
Introducing the finite element space, 

Vi = {w 7 1 w 7 e (C°(ft)) 3 , w'h e (Pi(T e )) 3 , VT e e ft}. 

we consider the following SUPG formulation : Find Zj[ € V i such that for all W 7 €E V^, 

f? 7 (Z', + B\ZI W ! ) supg + B J {Zl W 7 ) bc = 0, (23) 


B 1 (Z 1 ,W f ) ga i = [ (W 7 • (C 7 Zf, + C 7 Zf 2 )) dft, (24) 

Jn 

where, 

B’{ Z 7 , W 7 ) sup9 = [ (CfW 7 ! + C 7 Wf 2 ) ■ r 7 (C 7 Zf! + C 7 Zf 2 ) dft, (25) 

Jn 

and B ! ( •)(,,. is a term incorporating the desired boundary conditions. Note that here, the Galerkin 
term (24) has been integrated by parts, thus, B l ( •, -)b c also includes the additional boundary terms. 

Our desire is to construct a stabilized finite element method which admits solutions, Z ! h , with 
the same low Mach number asymptotic behavior as the solutions of the isentropic Euler equations, 
Z 7 . In addition, we require that the stabilization operator (25) must scale like the Galerkin and 
boundary operators (24) as the Mach number is reduced. Specifically, these requirements imply 
that 


c(z 7 i1 + c 7 z 7 i2 ~ 


0(e- 7 ) 

0 ( 1 ) 

0 ( 1 ) 


for e — > 0, 


(26) 
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and that 


c' ■ f‘ 


(C{Z M + C^Z M )~C{zi il + C{z( il , for i — 1,2, e 


(27) 


Expressions (26) and (27), provide an additional condition, on the asymptotic behavior of the 
components of the stabilization matrix t 1 that should be satisfied for e — > 0. This condition can 
be written as, 


0(1) 0(e~') 0(e~ 1 )' 


Til T 12 n 3 


’ 0(e- x ) ' 


’ 0(e _1 ) ' 

0(e _1 ) 0(1) 0 


Tl2 t 22 T23 


0(1) 


0(1) 

| 

o 

o 

7 

UJ 

o 

1 


Tl 3 T23 T33 


0(1) 


0(1) 


2.5.1 Asymptotic behavior of r using standard definitions 

We consider now the standard definition of the stabilization matrix, given in (22), applied to 
our simplified problem (23), and show that it fails to satisfy condition (28). Specifically, for one 
dimensional problems, the standard stabilization (22) is, for p = 1, 

f[ d = k \C[ d \ x , 


where k, is a constant proportional to the element size. Based on the above assumed behavior of 
the solution, it can be shown that for e 0, 


0(e) 0(e 2 ) 

0(e 2 ) O(e) 


(29) 


It can easily be verified that this stabilization matrix, does not satisfy the condition (28), which in 
the one dimensional case is 

! ' 0(1) 0(e- J ) 

_ 0(e- x ) 0(1) 

Therefore, this form of f 1 , will fail to provide adequate stabilization when e -» 0. It can also be 
shown that, in the multi-dimensional case, the standard choice of stabilization matrix (22), does 
not have the appropriate asymptotic behavior for e — > 0, either. 

When working directly with entropy, or conservative variables, the analysis is more complicated 
because the link between these variables and the incompressible velocity and pressure is less trans- 
parent. It is observed in practice that the standard stabilization scheme (22), when used with the 
equations written in terms of entropy variables, produce solutions which deteriorate severely when 


Til T 12 

Tl2 T22 



1 

7 

UJ 

0 

1 


1 

0 

n\ 

1 

1 


0(1) 


O(i) J 


(30) 
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the Mach number is reduced. In [8], the use of the standard finite volume upwind scheme, using 
Roe’s dissipation, is shown to give in the incompressible limit, solutions in which the pressure vari- 
ations do not scale like the Mach number squared. This first order finite volume upwind scheme, 
can be thought of, at least in one dimension, as the result of using a stabilization matrix of the 
form (22), directly with conservative variables. 

2.6 Alternative definition of r for low Mach number flow 

Our objective is to derive a stabilization matrix r, for the variational formulation in entropy 
variables (18). An approach which has been advocated in the design of low Mach preconditioners 
for finite volume schemes [23, 6], has been to derive an appropriate stabilization matrix t, for the 
system (8), and then transform it to conservative variables. 

It is apparent that, even with the additional constraint given by condition (28), the stabilization 
matrix is not uniquely defined and some freedom is still available in constructing it. In addition 
to the requirements placed on the construction of the standard forms of t, i.e. (22), we place the 
following three constraints: 

i) the entropy equation, which decouples from the other equations, is stabilized as an indepen- 
dent quantity convected with the velocity. 

H ) the vorticity equation, which, in the incompressible limit, decouples from the other equations 
after taking the curl of the velocity evolution equations, is stabilized as an independent 
quantity convected with the velocity. 

in) the resulting algorithm has the correct low Mach number scaling as discussed in the previous 
section. 

In order to incorporate the above conditions, we consider the Euler equation corresponding to the 
variational formulation (23), augmented with the entropy equation, 

CjZ i + C 2 Z 2 = (Cif'(CiZ i i + C 2 Z 2 )) j + (C^CiZ,! + C2Z 1 2)) 2 • (31) 

We shall further assume that, only for the purpose of deriving r, the matrices Ci and C 2 are locally 
constant and therefore the above system of equations can be treated as if it was linear. 

Requirement i) forces the last row and column of r to be zero, except for the diagonal entry 
which corresponds to a velocity time scale. Requirement vi). implies that the rest of the matrix 
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must also be of diagonal form. The entries in the second and third rows must be equal and are also 
associated with a velocity time scale. Therefore, we find that f, for the complete system must be 
of the form 

a 0 0 0 

0 6 0 0 
0 0 6 0 
0 0 0 6 

where 6 = l/|u| and h e is the size of the element. In order to satisfy the low Mach number scaling, 
it is clear that 6 ~ 0(1), and therefore a must be 0(e 2 ). The simplest choice, having the right 
dimensions, is o = |u|/c 2 . For very high speed flows, the stabilization should transition to a pure 
streamwise upwinding such that a — > l/|u|. In practice, the specific definition of t which we use 
is. 





Using these definitions, the acoustic timescale, r a , has the correct low Mach number behavior 
required by the asymptotic analysis. Also, r Q behaves appropriately for large local Mach numbers 
where it returns to the convective timescale r c . 

Finally, the required form of r can be obtained by transforming the system (31) to entropy 
variables. In principle, this could be accomplished by using the transformation matrix S = V, Z - 
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Inserting dZ = S ' x dV into (31) and multiplying through by S 7 , gives, 

S -r CiS -1 V,! + S -t C 2 S -1 V )2 = (S- 7 'C 1 S- l SrS r (S- r C 1 S- 1 V, i + S- T C 2 S~ l V- 2 )) A + 

(S- T C 2 S- 1 SrS T (S- T C 1 S- 1 V ! + s- T c 2 s- l v a )) 2 . 

(34) 

Where, the transformed matrix t, can be readily identified as 


r = S tS t . 


(35) 


We note that the transformed matrices S _T CjS _1 ,i = 1,2, are not equal to A = 1,2 and indeed 
S“ T S _1 is not be equal to A 0 . This is due to the fact that the entropy equation in (8), completely 
decouples from the rest of the system. Therefore, one can multiply the first three components of 
dZ, i.e. dp/ pc,dui,dv, 2 , by a scalar factor, and the fourth component of dZ, i.e. ds, by different 
scalar factor without changing the jacobian matrices C i,i = 1,2, or r, in (31). It is not hard to 
find the scalar factors that one should use to define modified dZ variables so that the resulting 
S matrix would give S -T C;S -1 = A t ,i = 1,2. An alternative procedure for evaluating S, which 
avoids the use of modified variables is given in [1]. Once the appropriate transformation matrix 
has been evaluated, r is found using expression (35). 


2.7 Numerical results 

In this section we present some numerical results that illustrate the performance of the proposed 
algorithm. For all the examples, we have solved problem (18), which is expressed in terms of 
entropy variables. The non-linear set of equations resulting from the discretization of (18) is solved 
by a Newton- Raphson iteration using exact linearization. For some of the simulations however, it 
was necessary to damp the Newton-Raphson iteration during the first few iterations. The solution 
of the non-symmetric non-linear system of equations required for each iteration was solved using 
an enhanced BiCGstab(2) algorithm [7, 21] together with a block ILU(fc) preconditioning, with k 
either 0 or 1. For all the examples we have considered the two choices of entropy variables given in 

[1] , and we have not found any appreciable differences in the computed results. We have followed 

[2] , and employed a linear representation of the solution over each triangular element, but have 
used a quadratic mapping to more accurately represent the geometry. We have found that using 
quadratic interpolation of the geometry greatly improves the quality of the solution and only incurs 
a minimal incremental cost. 
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2.7.1 Example 1: Flow over cylinder 

In this example, we consider the flow about a cylinder and perform several numerical tests. Because 
of the symmetry of the problem at the low Mach numbers considered here, only half of the domain 
is considered for the solution. We have used a sequence of three meshes of triangular elements 
containing 1271, 4941, and 19481 nodes, respectively, to discretize to computational domain. The 
medium and fine meshes are obtained by uniformly subdividing the coarse mesh. Figure 4 shows a 
detail of the medium size mesh near the cylinder. 

In the first test, we consider the SUPG algorithm (18), with the standard choice of r given by 
(22) for p = 1, and solve for the flow about the cylinder on the coarse mesh for free stream Mach 
numbers of 0.38, 0.1 and 0.01. Figure 5 shows the pressure contours for the three solutions. The 
degradation in the solution accuracy when the Mach number is reduced is clearly apparent. We 
could not obtain any numerical solutions below a free stream Mach number of 0.01, and for this 
Mach number, the iteration would only converge if a damped Newton-Raphson iteration was used. 
For comparison purposes, we show in figure 6 the pressure contours obtained, for the same flow 
conditions and mesh, when the proposed form of r, given in (35), is used. No qualitative degradation 
of the solution is observed and in fact, we observe that, the solutions for Mach numbers of 0.1 and 
0.01 look almost identical, as expected. 

In the second test, we perform a mesh convergence analysis at various Mach numbers. Figure 
(7) shows the Mach number contours computed on the coarse, medium and fine mesh for a free 
stream Mach number of 0.38. Analogous results, but now for a Mach number of 0.001 are shown in 
figure (8). The qualitative behavior of the solution, at both Mach numbers, does not present any 
anomalies. In figure (9), a plot of the solution error norm versus grid size is shown. Here, solutions 
at Mach numbers of 0.38, 0.01, and 0.001 are compared and second order convergence is observed 
in all cases, showing no deterioration in the convergence rate as the Mach number is reduced. The 
norm of the solution error considered was [/ Q (V - V /i )A 0 (V - \ h )dQ ] 1/2 , where V, is a reference 
solution computed using a quadratic approximation on a highly refined grid of 77361 nodes [26]. 

2.7.2 Example 2: Flow over an airfoil 

In this example, the proposed scheme was used to simulate the flow over NACA 0012 airfoil at 
Mach numbers of 0.01 and 0.6, and at an angle of attack of 2°. An unstructured triangular mesh 
of 19948 nodes was used for the simulation. Figure 10 shows a coarser mesh from which the 
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19948 node mesh can be obtained by uniformly subdividing each element into four elements. The 
computed Mach number and pressure contours for the two flow conditions are shown in figures 11 
and 12, respectively. The qualitative behavior of the solution looks excellent, and again, no solution 
accuracy or numerical anomalies are observed when very low Mach number flows are considered. 



Figure 4: Detail of the medium size mesh used for computations 


3 High Order Finite Element Discretization of the Compressible 
Euler and Navier-Stokes Equations 

This section centers upon a high-order accurate, stabilized, finite element method for the numerical 
solution of the compressible Euler and Navier-Stokes equations. The SUPG finite element method 
for compressible flow simulations was initially developed and analyzed by Hughes et al. [14, 15, 
13, 20] and has since gained significant popularity. Its relation to multidimensional upwinding was 
elucidated in [28, 29] and higher order implementations for inviscid flows were presented in [2]. 
In [1], the SUPG algorithm was extentded to cover the simulation of near-incompressible flows 
by employing a stabilization matrix which exhibits proper scaling over the entire range of Mach 
numbers. Here, we focus on the higher order implementation of the algorithm developed in [1] for 
invicid and viscous flows. 
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Figure 5: Static pressure contours computed on the coarse mesh using the SUPG algorithm with 
the standard choice of stabilization matrix given by (22), for a free stream Mach number of a) 0.01, 
b) 0.1, and c) 0.38. 
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Figure 6: Static pressure contours computed on the coarse mesh using the SUPG algorithm with 
the proposed choice of stabilization matrix given by (35), for a free stream Mach number of a) 0.01, 
b) 0.1, and c) 0.38. 
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Figure 7: Mach contours for a free stream Mach number of 0.38 computed on the : a) coarse, b) 
medium, and c) fine meshes. 
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Figure 8: Mach contours for a free stream Mach number of 0.001 computed on the : a) coarse, b) 
medium, and c) fine meshes. 
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Figure 9: Grid convergence plot showing the solution error norm versus the grid size parameter 
using the proposed algorithm and for free stream Mach numbers of 0.001, 0.01 and 0.38. 



Figure 10: Detail of the unstructured triangular the mesh used for the airfoil computations 
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Figure 12: Static pressure contours for the solution of the flow over NACA 0012 at an angle of 
attack of 2° for a free stream Mach number of : a) 0.01, and b) 0.6. 
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3.1 Compressible flow governing equations 

We start from the time dependent two dimensional compressible Euler equations in conservation 
form 


U, t + (F - F u ) m + (F - F”) 2 ,2 = 0, 


( 36 ) 


where 


and 
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In the above expressions, p is the density; u = [ui,u 2 ] T is the velocity vector; E is the specific 
total energy; p is the pressure; and the comma denotes partial differentiation (e.g. Uy = dU/dt, 
the partial derivative with respect to time, F itj = dF l /dx J , the partial derivative with respect 
to the j-th spatial coordinate). The system of equations is closed once the pressure is related to 
the problem variables through the equation of state, p = (7 — 1 )pe, where e = E - |u| 2 / 2 , is the 
internal energy. Here, 'y is the ratio of specific heats and p is the absolute viscosity, both of which 
are assumed to be constant. Following the usual assumptions: 
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We will assume that all the above quantities have been non-dimensionalized using reference, or free 
stream, values for density p , . velocity »* , and length L. Thus, the dimensional variables, denoted 
with an overbar, are related to the non-dimensional variables introduced above as 


P = 


P_ 

P* 


Ui — 


Ui_ 

U* 


i = 1,2, 



E - 


E 


uz 




i = 1,2, 


and 



We note that the equation system (36) can be written as 


U, , + AiU i + A 2 U 2 = (KnU i),i + (K 12 U 2 ),i + (K 2 iU, 0,2 + (K 22 U >2 ), 2 , (37) 

where the Jacobian matrices A, = Fj,u, * = 1, 2, are unsymmetric but have real eigenvalues and a 
complete set of eigenvectors. K ij = F ' y are the viscous flux jacobians. The above equation may 
be symmetrized through a change of variables, for details, we refer the reader to [13, 12]. 


3.2 Variational formulation for the steady state problem 

We now consider the compressible steady problem in conservation form expressed in terms of 
symmetrizing variables. The conservative form of the equations is taken to be the starting point 
because we are ultimately interested in an algorithm that can be used over the whole range of 
speed regimes, including situations were the solution may contain discontinuitites. The problem is 
defined in a domain 0 with boundary T by 

(F + F v )i(V),i + (F + F , ') 2 (V), 2 = 0 in ft, (38) 

A“V = A“g on T\r a , (39) 

F u • n = f on T a (40) 

For simplicity, the domain boundary is assumed to be made up of a solid wall I a , and a computa- 
tional far field boundary T\r a . In (39, 40), n = is the outward unit normal vector to T, 

and A n = A„A 0 , A n = Aini+A 2 n 2 . Finally, A~ = A“A 0 , and A“ denotes the negative definite 
part of A n . 

Let the spatial domain fi, be discretized into non-overlapping elements T e , such that (l = (JT e , 
and T e f]Tp = 0, e ^ e'. We consider the space of functions V h , defined over the discretization and 
consisting of the continuous functions which are piecewise linear over each element 

V* = {w | w e (C°(ft)) 4 , W|r e G {v k {T e ))\ VT e G n}. 
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( 41 ) 


The SUPG algorithm can then be written as: Find Y h eV h such that for all W 6 V /l , 

B(V h ,W) gal + B(V h ,W) supg + B(Y h , W) 6c = 0, 

where the forms B{-,-) ga i, B{-,-) supg and B(-,-) bc account for the Galerkin, SUPG stabilization, 
and boundary condition terms respectively, and are defined as 

B(V.WL= f(-W, 1 -(F-r) 1 (V)-W, 2 -(F-F") 2 (V))(iO, (42) 

Jq 

B(V , W) supg = f ( Ai W,i + A 2 W, 2 ) • t (Ai V,i + A 2 V 2 - (KnV,i),i - (Ki 2 V >2 ),i -(43) 
(K 2 iV j i) )2 — (K 22 V, 2 ), 2 ) dfl, 

and 

B(V,W) bc = f W-(F// + F w )(V,g;n)<fa.+ f W ■ F^V, f; n) ds. (44) 

Mu. JVa 

where r is the stabilization matrix. The numerical flux function on the far field boundary Fyy, is 
defined by 

F„(V_, V+| n) = i(F„<V_) + F n (V+)) - i|A„(V(V_, V + ))|(V+ - V_). 

Here, |A„(V)| = A+(V) - A _ (V) is the absolute value of A n evaluated at V*, and V*(V+, V_), 
is the an average between the states V + and V The average state, V*, is chosen to ensure 
the global stability of the algorithm [2]. The acquisition of V* requires iteration and in practice 
an arithmetic average may be used. The Roe flux [19] was used in all the numerical simulations 
presented herein. For inviscid compilations, the viscous terms in the expressions above would of 
course, vanish. For viscous simulations, Dirichlet boundary conditions may replace portions of the 
boundary integral. 

3.3 Numerical results 

In this section we present some numerical results that illustrate the performance of the pioposed 
algorithm. Test problems were solved employing both linear and quadratic element approximations. 
For comparative purposes, the meshes used for all linear element approximations were obtained by 
subdividing each element of the corresponding quadratic element mesh into foui linear elements. 
In this way, comparisons between P\ and P 2 solutions involving the same number of nodes can 
be made. h c thus represents the distance between two nodes in the meshes used in the numerical 
simulations presented herein. 
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3.3.1 Example 1: Rinleb flow 

In this example, we consider a ringleb test case (an exact solution of the Euler equations, [27]). 
The error is computed in the L 2 entropy norm [1]. Both P\ and P 2 element approximation achieved 
their respective optimal convergence rate of 0 (/t 2 ) and 0(f i 3 ) respectively, as can be seen in figure 
13. 

3.3.2 Example 2: Flow over an airfoil 

In this example, the proposed scheme was used to simulate the flow over NACA 0012 airfoil at a 
Mach number of 0.6, and at an angle of attack of 2°. In figure 14, the L 2 entropy deviation for 
both P\ and P 2 simulations are presented. The quadratic element approximation results in a much 
lower level of entropy error than its linear element counterpart. The geometric singularity at the 
trailing edge of the airfoil requires a much finer discretization around that point relative to the 
rest of the mesh for both the linear and quadratic element approximations to achieve their optimal 
convegence rate. This is particularly important for the P 2 approximation since the error away from 
the geometric singularity vanishes far quicker, rendering the trailing edge error as the dominant 
source of error for the numerical approximation. Figure 15, shows the computed Mach contours 
for the pi and P 2 simulations. 

3.3.3 Example 3: Flow over flat plate 

In this example, we consider flow over a flat plate of unit length. The computational domain is 
[-1.5, 1] x [0, 1], with the leading edge of the plate at (0,0). The free stream Mach number is 0.5. 
The Reynolds number is raised from 8000 to 64000 in successive simulations while keeping the mesh 
unchanged. The results in the form of boundary layer thickness, ^ 99 {x = A), are plotted in figure 16. 
The quadratic element approximation yields results very close to that of the Blasius solution while 
the linear element approximation shows increasing error with rising Reynolds number. Further 
numerical tests have shown that it is possible to resolve the Blasius boundary layer with only two 
elements when quadratic element approximation is used. 

3.3.4 Example 4: Transonic computations 

Here we show the application of the algorithm proposed using P 2 interpolations to the inviscid 
computation of the transonic flow over a NACA0012 airfoil at a mach number of 0.8 and and angle 
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of attack of 1.2° degrees 17. The shock capturing term employed is that reported in [2]. This 
results are only preliminary but illustrate the ability of the high order finite element algorithm to 
capture shocks. 

4 An Algebraic Multigrid Method for Convection-Difsusion Flows 

Rapid advances in unstructured mesh methods for computational fluid dynamics (CFD) have been 
made in recent years and, for the computation of inviscid flows, have achieved a considerable level 
of maturity. Viscous flow technology is also rapidly developing and the use of unstructured grids 
has been indispensable. Unstructured meshes offer a practical means for computation and have 
the advantages that they provide both flexible approximations of the domain geometry and easy 
adaptation/refinement of the mesh. 

Accurate and efficient solutions to the compressible Navier-Stokes equations, especially in the 
turbulent high Reynolds number limit, remains one of the most challenging problems due to the 
myriad of associated length scales required to properly resolve flow features. This is especially 
true in the boundary layer regions where severe grid anisotropy is required. Discretization of the 
partial differential equations on the mesh gives rise to a large linear system of equations. For 3D 
problems, these large discrete problems often cannot be solved using direct solution methods. As 
a result, iterative solution methods based on Krylov subspace methods and/or multilevel methods, 
which include multigrid and domain decomposition methods, are attractive. Multilevel methods 
can often provide mesh independent convergence rates [31] and offer good scaling of the compute 
time as well as data storage requirements. In the typical context, multilevel methods are not used 
as solvers but as preconditioners for Krylov subspace iterative solvers. This provides a powerful 
and flexible framework for computation. 

There are two major problems associated with AMG for the solution of convection diffusion 
flows. The first is the definition of accurate coarse spaces in the multilevel construction. This 
is required to properly capture the behaviour of the discretized equations on these coarse spaces. 
The second problem is the behaviour of the smoothing operators with high Reynolds number 
and anisotropic effects. These two effects are typically the leading causes of convergence rate 
deterioration. 

In this section, we present a multigrid methodology for the solution of convection-diffusion based 
problems, especially in the finite element context. The target application for this algorithm is high 
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Ringleb Flow 
















Mesh - Flow Over Flat Plate 



Figure 16: Top: Computed value of boundary layer thickness at x = 1, Bottom: Computational 
mesh. 
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Reynolds number Navier-Stokes flows. 


4.1 Model problem 

The model problem considered is the linear convection diffusion problem 


U -V<£ = 

vV 2 <t> + f 

in fi, 

(45) 

<t> = 

OiD 

on dtlo, 

(46) 

dcj) 

dr] 

&N 

on 

(47) 

where fi is a bounded domain in lR n (n = 

1,2,3) with boundary and U = 

, . , U n ) is an 


incompressible prescribed velocity field. The finite element discretization is based on the stabilized 
SUPG formulation analogous to that described in the previous sections. Introducing the variational 
form of Eq. 45, the problem reduces to finding <p € Hg(f2;dfiD) such that 

afav) = F(v) (48) 

where 


{(f), v) = 

[ {j(U-V0) dSl+ f vV(f)Vv dn 

(49) 


Jn Jn 


F( v) = 

[ fvdx 

(50) 


Jn 


V = 

v + rU ■ Vi> 

(51) 


H ( ')(4b c4b)) is the Sobolev space which contains functions that vanish on dfi/? with square 
integrable first derivatives, v is the stabilized SUPG test function and r is the SUPG stabilization 
parameter. The discretization of the problem is done by covering SI with non-overlapping finite 
elements through a triangulation and defining standard linear basis functions over these elements. 
The discrete problem now reduces to: 

Find (f>h € V), such that 

a(</>/,, Vh) = F(vh) Vvh G Vh (52) 

where V h is the finite dimensional subspace of Hj(0; dU D ) consisting of continuous functions which 
are linear over the elements. This results in a system of linear equations 

A<fi — b (53) 
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which need to be solved for the discrete solution (fih- 

In order to obtain accurate results which provide detailed resolution of the flow field, a well 
resolved mesh is required which gives rise to a large linear system. In general, iterative methods give 
very good performance when the underlying matrix A is symmetric and positive definite (SPD). 
However, the resulting matrices from the discretization of convection-diffusion problems are not 
SPD. This is usually due to the anisotropic stiffness introduced in the system of equations through 
the grid and the convective nature of the system. 

Here, the development of a multilevel methodology based on algebraic multigrid is discussed. 
Multigrid has shown great promise in the solution of linear algebraic systems and can be shown 
to have mesh independent convergence for a wide range of problems. Iterative schemes remove 
certain types of errors, typically high frequency (rough) errors but are unable to damp out the 
low frequency ( smooth ) components. Multigrid may be used in conjunction with these iterative 
schemes to form a powerful solver by removing these smooth error components. Representation of 
these smooth errors on the multigrid coarse spaces means that they appear rough on these spaces 
where they may be damped out effectively. 

The construction of these coarse spaces may be done in several ways and the most obvious 
one is to simply retriangulate the domain with a larger mesh spacing. This however is a very 
expensive procedure especially for meshes required for Navier-Stokes simulations which can have 
very complex geometries. These methods are termed Geometric Multigrid (GM) and they make full 
use of geometry. Another way is by nodal decimation which involves selection of a vertex subset and 
retriangulation. The selection process is typically based on fine grid geometry and depends on some 
pattern in the fine grid [32], Depending on the pattern, different coarsenings arise. Calculations in 
the inviscid regions of the mesh use a full coarsening technique which gives a 4:1 reduction in 2D, 
an example of which is given in Fig. 18(a). However, alleviation of the stiffness due to stretched 
grids in viscous flow calculations requires semi-coarsening [33] which gives a smaller reduction. We 
refer to [34] for other references on this. 

In contrast to geometric multigrid, another promising avenue is Algebraic Multigrid (AMG) 
which uses an algebraic definition for the coarse spaces by agglomeration of the finite element 
subspace on the fine grid [35]. A purely algebraic definition allows for automatic construction of 
the coarse spaces and does not require geometric information. However, the smoother and the 
coarsening algorithms need to be carefully matched. The agglomeration technique is typically 
nodal [36, 37, 38, 39] which results in the Additive Correction Multigrid (ACM) method. However, 


40 



(a) Vertex Agglomeration (b) Element Agglomeration 

Figure 18: Agglomeration Types 

another effective method is through elemental agglomeration which involves the agglomeration 
of neighbouring elements into macroelements as shown in Fig. 18(b). Hence, the coarse space 
elements are not standard elements and as such, the coarse space meshes are not proper meshes. 
Appropriate basis functions as well as transfer operators need to defined. Perhaps the most recent 
development in this area is by Chan et al [34, 31] and the results have been shown to be promising. 
This coarsening technique is based on the underlying graph of the fine grid and does not involve 
geometry. The technique produces a set of node-nested coarse spaces which can be retriangulated 
based on fixed patterns in the agglomerated macroelement. This method offers great potential 
since the proposed interpolation operators are based on integers and leads to savings in storage and 
CPU time. Also, the algorithm recovers the natural structure of the coarse grids if the fine grid 
is regular. However, since the algorithm is purely topology-based, it does not distinguish between 
anisotropic and isotropic mesh regions which may lead to decreased convergence of the multigrid 
procedure. We propose a new and simple technique for defining coarse spaces which are properly 
nested in both the elemental and nodal sense. This method represents a hybrid between geometric 
and algebraic multigrid. 
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4.2 Multigrid Principles 
4.2.1 Multigrid Requirements 

In order to solve the linear system Eq. 53 using multigrid, the definition of the coarse spaces as 
well as the representation of the error components on these coarse spaces must be defined. Let 
{^>k '■ (k = 0, . . . , m)} represent the hierarchy of finite dimensional coarse spaces along with the 
associated coarse grids. Also, let {A k : (k = 0,...,m)} be the approximations of A on these 

subspaces such that Ao = A. In order to represent the error in one space on the next coarse space, 
we require a transfer operator called the restriction 

R k : tffc -> (54) 

which acts as a mapping between these spaces through a reduction in the space dimension. Also, 
error correction on the fine space from the coarse space requires the transfer operator called the 
prolongation 

Pfc : -> tffc-l ( 55 ) 

which acts as a reverse mapping. To complete the picture, we require a smoothing operator S k 
which acts to reduce the rough error components on each subspace v l ; k . These smoothers may be 
different on each grid but are typically chosen to be the same e.g GauB-Seidel . The generalized 
multigrid cycle now reduces to 

1. Perform iq pre-smoothing sweeps on the fine grid. 

< ^ = ^- 1 + S fc (6 fe -A fc <^- 1 ) (p = 1 ^i) 

2. Restrict the equation residual from the fine grid to the coarse grid. 

5jfc+l — ^k{^k ~~ ^k^k.pre) 

3. Solve on coarse grid and compute the coarse grid correction. 

<Pk + 1 = A^bk+i 

4. Prolong the correction back to the fine grid from the coarse grid. 

0k,corrected 4*k,pre + Pjt^jt+1 
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5. Perform u 2 post-smoothing sweeps on the fine grid. 

4*k ~ file, corrected 

<t? k = + (p = !,••■ ^2) 

Depending on the schednling of operations between the spaces, we end up with different flavors 
of multigrid cycles such as the V-cycle, W-cycle and F-cycle ([40]). In the algebraic multigrid 
context, the coarse grid approximations Ajt to A are defined using the formula 

A k +\ = RfcAfcP k (56) 

In our implementation of multigrid, the coarsening procedure terminates when the coarse grid 
operator A k is small enough to solve exactly. 

4.2.2 Galerkin Form for Transfer Operators 

While the restriction and prolongation operators are independent, a Galerkin form of these opera- 
tors requires that 

R* = Pi 

Given the fact that a correction from the coarse space is required to remove the smooth error 
components from the fine space, it is natural to seek the best possible correction. Let <p k +\ represent 
the correction from the coarse grid and (f) k the current solution on the fine grid. The error in the 
solution after correction is thus 

e k = <f>k + Pfc</>fc+i - A fc lfo fc 

Let us measure the error in the A-norm, || • ||a and minimize the error: 

min F(<f>k+i) = min ||(^ + P*,^fc + i) - A^ 1 ^ ||a ( 57 ) 

4>k + 1 0A:+1 

= min (cj) k + P k (f> k+ i — A^' 1 f>jt) T A/fc (cj> k + Pfc</>fc+i — A k l b k ) 

4>k+ 1 

Differentiation of the quadratic form with respect to <f>k+i gives 

(A*. + AjjT )(4>k + Pk^k+i “ l bk) = 0 (58) 

For a symmetric matrix A&, we may easily solve for < fih+i an d obtain 

4> k+ x = {PlAkPkT^Ubk-AM (59) 
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Further examination of the Hessian shows that this is a minimum. Comparing this formula to the 
A*; in Eq. 56, we find that 

R k = P[ (60) 

where the restriction is now defined as the formal adjoint of the prolongation: 

R/t = n ( 61 ) 

The same argument cannot be made for a non-symmetric matrix since the use of symmetry in 
the proof may not be used any longer. However, it can be seen that even in this case, Eq. o9 
corresponds to a stationary point but no information may be gleaned from the Hessian. However, 
if we measure the error in the residual in the L2-norm and follow a similar proof, we again come 
to the same result as Eq. 60. 

4.2.3 Fixed Point Iterative Methods 

Let us consider a splitting of the matrix A in the following form: 

A fc = Mfc — Nfc (62) 

where M* is non-singular. The basic idea behind preconditioning is to obtain a matrix M k such 
that M*, « A*, and inversion of is much less expensive than A*,. A basic iterative method is 
defined as the following linear fixed-point iteration: 

<^ +1 = M^N^ + M (63) 

= 4 + M k l {b k -A k <f>i) 

The matrix M*, is known as the preconditioning matrix and matrix S* = ’N*. = I — M fc 1 A^ is 

called the iteration matrix or smoother. Damping may also be taken into account by defining: 




(64) 

4 +1 

= u)(j> k 2 + (l — 

(65) 


= 8lti+u>M?b k 

(66) 


where 

S* k = wSfc + (1 - 0»)I (67) 
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For a given coarse space, let the exact solution be (j . ^ and the error in the solution be 


4 k =4> l k-<t>k ( 68 ) 

T his error is controlled by S k in the following fashion: 

e^t 1 = 4 +1 -^ 

= S k 4 + M^A k <t> k -I<t>k 
= s k <4 - s k <t> k 

= S k e^ k (69) 

The convergence property of the iterative method (63) can be summed up in the well known result: 
Theorem 1 Convergence of (63) for any initial guess u Q is equivalent to 

p { s*) < i ( 7 °) 


4.2.4 Multigrid as a Fixed Point Method 

Multigrid may also be thought of as a fixed point method and this can be shown fairly easily foi 
the V-cycle multigrid cycle. We consider the general V{vi,v 2 ) cycle for the two-level method but 
simplify it by assuming that we have only one pre-smoothing and one post-smoothing i.e a V(l,l) 
cycle. Let A represent the fine grid matrix and A represent the coarse grid matrix. For an initial 
guess (j>^ = 0: 

1. Pre-smoothing: <f>^ = S T b 

2. Coarse grid correction: 

(a) Restrict residual: 

g< 0) = R(I - AS T )b 

(b) Coarse grid solve: 

qW = A _1 R(I — AS T )b 
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(c) Fine grid correction: 

4>W = 0(»+R T qW 

= [S 7 ' + R t A^ 1 R(I — AS r )]6 

3. Post-smoothing: 

M -1 6 = 4> {2) + S{b- A(j) {2) ) 

= [S + S T - SAS t + (I - SA)R t A _ 1 R(I - AS t )]6 

The multigrid iteration matix S mu itigrid now takes the form: 

S multigrid ^ 

= (I-SAXI-R^^RAXI-S 7 ^.) 

For the extension to multiple levels and variable number of pre- and post-smoothing sweeps, we 
refer to [31]. 

4.2.5 Convergence Conditions 

In order to obtain mesh independent convergence properties, certain conditions must be met by 
the transfer operators. The definition of the subspace is obtained by interpolation in VT - 1 1 
and according to the analysis in [31], these subspaces must satisfy stability and approximation 
properties to ensure convergence of Eq. 63. These properties are: 

||Rfcu||i,n < C|M|i,n, (stability) (71) 

||RfcU — u||o,n < C7i||u||i,n (approximation) VueH^fl) (72) 

and as noted, special care must be taken in defining Ha . Another condition as outlined in [40] is 

mp + mn>2m (73) 

where the orders m P , m R of and R fc are defined as the order (degree plus one) of the polynomials 
that are interpolated exactly by and R fc , and 2m is the order of the governing partial differential 
equation. 

The use of nodal agglomeration in ACM to construct the subspaces results in the definition of 
the restriction as an injection operator. This has several associated problems. The first is that in 
3D, this operator violates the stability property (71) ([31]). Secondly, both m, P and m R are unity 
and for the Laplacian operator (2m = 2), the accuracy condition (73) is violated ([40]). 
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4.3 Agglomerated Coarse Space 

In this section, we now describe the construction of the coarse spaces as well as coarse space basis 
functions and multigrid operators. Our goals in the agglomeration and associated construction of 
the coarse space basis functions are: 

1. The coarse grid matrix defined on the coarse grid should be a good approximation to the 
fine grid matrix Ao- 

2. The coarse grid should have a reduction of anisotropic effects from previous coarse spaces. 

The proposed algorithm is based on the fusion of coarse space elements into macroelements with 
subsequent definitions of the coarse grid topology and basis functions. This method is applied 
recursively to generate the hierarchy of coarse spaces. One essential difference between this method 
and that proposed by Chan et al is that the coarse mesh elements are not converted into stan- 
dard elements by a retriangulation but are generalized polygons formed by the agglomerated fine 
mesh elements. This is especially attractive in 3D because of the complicated rules which may be 
involved for the retriangulation method described in [34]. Although the support for the basis func- 
tions defined on these macroelements is larger than standard triangular elements, a well designed 
agglomeration should relieve some of this. This algorithm also has the feature that the resulting 
coarse grid topology is both node-nested and element-nested. 

4.3.1 Coarse Space Topology 

The coarse grid topology is constructed by partitioning of the elements into macroelement groups 
as shown in Fig. 19. A macroedge is defined to be the ordered collection of fine grid edges which 
are shared by two neighboring macroelements. To complete the definition of the coarse grid graph, 
the coarse nodes are chosen to be the fine grid nodes where three or more macroedges meet. This 
is the reverse of what is described by Chan et al where the coarse grid points are first defined and 
then the macroelements are chosen. 

Two different element partitioning algorithms have been developed and both are similar in the 
sense that they are based on elemental accretion across edges using some measure of the coupling 
strength of the vertices making up that edge. They however differ in that one tries to alleviate grid 
anisotropy while the other typically introduces grid anisotropy in regions of stiff matrix coefficients. 
The second algorithm is based on a semi-coarsening algorithm using the subspace matrix as the 
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Macroelement 


Figure 19: Coarse Space Topology 

control parameter while the first is a reduced-geometry based algorithm which attempts to alleviate 
stiffness in regions of highly stretched grids. 

Matrix Based Agglomeration 

The matrix based algorithm has its origins in the pure AMG implementation of the ACM 
algorithm. However, modifications are required in order to apply this to element agglomeration 
as opposed to vertex agglomeration which it is was originally designed for ([37]. The basic idea 
is that strong matrix coupling between vertices in the graph of the mesh typically corresponds to 
a weak coupling between the vertices in the dual graph. The strength of the coupling is typically 
based on the stencil coefficients of the matrix. We will describe later, the implemented definition 
of the vertex coupling strength in relation to the development of the implicit line solver. Since the 
dual graph vertices correspond to the elements, agglomeration of elements across edges with weakly 
coupled vertices is equivalent to directional agglomeration of strongly coupled vertices. 

Using this principle, we may now develop an algorithm which directionally clusters elements 
in order to relieve stiffness in the next coarse space. The accretion is performed with a Breadth 
First Search (BFS) / Greedy algorithm which maintains a queue of elements sorted according to the 
relative coupling strengths of the vertices on their bounding edges. We now present the algorithm 
in detail: 

Algorithm 1 (Matrix Based Macroelement Construction) 
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Figure 20: Matrix Based Agglomeration in Boundary Layer 

Step 0: Consider the graph of the mesh: G = ( V,E ) and calculate the coupling strengths for the 
edges E in the graph. 

Step 1: Seed element procurement. Obtain a seed element to initialize the BFS algorithm. If 
there is no seed element in the queue, choose any suitable element which does not belong to a 
macroelement group. 

Step 2: Perform accretion around the seed element by recursively considering the neighbouring 

elements. 

Step 3: Repeat Step 2 until the macroelement has desired number of elements or heap contains 
no more elements. 

Step 4: If the number of elements agglomerated is less than a specified fraction of the desired 
number of elements (usually ±), these agglomerated dements are ungrouped and the original 
seed element is marked. 

Step 5: Repeat Step 1 until all elements either belong a macroelement or has been marked as an 
invalid seed. 

In Step 2 above, a heap is maintained whose members contains a key pair consisting of element 

and face identifiers. Initially, seed element is placed on the heap. The head of the seed is then 

popped for the current seed element and added to the macroelement list. For every neighbor 
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of the popped element which does not belong to a macroelement, the connection st length of the 
separating edge is checked. If the edge is considered to be weak, the neighboring element is inserted 
into the macroelement. For every newly inserted clement, the neighbors which do belong to any 
macroelement are considered and the corresponding element/ edge key identifier pair is inserted into 
the the heap according to the edge connection strength. 

After the algorithm terminates, post-processing is necessary to deal with exceptions which may 

arise. These are described below: 

(1) Elements not in any macroelement: Elements which are marked as invalid seeds and have 

not been absorbed into a macroelement will end up as lone elements. These elements are 
merged with the neighboring element /macroelement that shares the edge with the weakest 
connectivity. 

(2) Insufficient number of elements: After exception 1 above, a macroelement may end up 

with an insufficient number of agglomerated elements. The macroelement is divided up 
amongst neighboring macroelements by erosion of the boundary elements until there are 
no more elements. The decision as to where an element goes is also based on the edge 
connectivity. 

(3) Insufficient number of coarse nodes: A macroelement may also end up with two or less 

coarse grid nodes which presents a problem in the construction of the transfer operators. 
These macroelements are also divided up amongst neighboring macroelements. 

This algorithm is closely tied to the line solver and can be particularly effective for equations 
with strongly preferential directions. Extension to 3D would be straightforward if a suitable face 
connectivity can be defined. If the connection strength for all edges is defined to be a constant, then 
an isotropic agglomeration algorithm is recovered. Unfortunately, since there will be no preferential 
direction, there is no real control in the regularity of the coarse grid and self similar meshes cannot 
be obtained for structured, topologically rectangular meshes. 

Geometry Based Agglomeration 

The geometry based algorithm is based on the idea of removing grid anisotropy as well as 
maintaining isotropy in the isotropic regions of the mesh. This is related to the work done by 
Mavriplis [33, 41] except that it is applied to elements as opposed to vertices. The proposed 
algorithm makes use of the edge lengths only and this represents a reduced geometry method. The 
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reduced geometry for the lower level coarse spaces is defined entirely in terms of the fine grid i.e 
the macroedge lengths are simply the sum of the edge lengths of the constituting fine grid edges. 
The decision to agglomerate two neighboring elements is based on a geometry based connectivity 
concept which we term macroelement skew. 

Definition 1 For a general polygon, the polygo?i skew is a measure of anisotropy and is defined as 
the area of the n-gon divided by the area of a perfect n-gon luith the same perimeter. 

In the extreme cases, this is zero for colinear polygon vertices and unity for a perfect n-gon. It can 
be seen that this readily extends to 3D. The macroelemental areas for the coarse spaces are also 
easy to compute as they are simply sums of the agglomerated element areas. This makes it easy to 
apply the algorithm recursively once the required geometry variables have been computed on the 
finest mesh. In order to complete the operators required for this algorithm, we need to define the 
edge connection strength which we term edge skew. 

Definition 2 For an element which borders a macroelement /element on a given edge, edge skew 
is defined as the macroelement skew of the macroelement which would be created if the element is 
merged with the macroelement/element across that edge. 

We now present the algorithm in detail: 

Algorithm 2 (Geometry Based Macroelement Construction) 

Step 0: Consider the graph of the mesh: G = (V,E) and calculate the edge length for the edges E 
in the graph. 

Step 1: Initialize seed queue. 

Step 2: Seed element procurement. Obtain a seed element to initialize the algorithm. If there 
is no seed element in the queue, choose any suitable element which does not belong to a 
macroelement group. 

Step 3: Perform accretion around the seed element. Place seed element in macroelement and for 
every neighboring element, compute the edge skew. Every neighboring element which has an 
edge skew larger than some specified fraction (typically 0.75) of the average edge skew and not 
in a macroelement is placed in the macroelement. 


51 


Step 4: Enqueue seed elements. New seed elements are placed in the queue to continue the algo- 
rithm. These are chosen to be elements which share a vertex but no edges with the macroele- 
ment. In 3D, this would extend to elements which share a vertex and/or an edge but no faces 
with the macroelement. 

Step 5: Repeat Step 1 until all elements either belong a macroelement or there are no more seed 
elements. 


A queue of seed elements needs to be maintained and hence in Step 1 above, this is initialized 
with one element. This initial choice can be very important especially in cases of unstructured 
meshes generated from structured data. In such a case, the agglomeration pattern may radically 
depart from a 4:1 agglomeration in 2D since the accretion algorithm will not properly identify 
potential elements. In this case, simply pick an element with no domain boundary edges. 

After the algorithm terminates, post-processing is necessary to deal with “sliver” elements which 
may not have been picked up by the algorithm. A determination of which macroelement to merge 
these elements with is made a-priori based on edge skew. In the case where the lengths and areas 
are equal, the algorithm degenerates to a 4:1 isotropic agglomeration in 2D and fully recovers the 
natural coarse structure for a regular grid. 






(a) Matrix Based Agglomeration (b) Geometry Based Agglomeration 


Figure 21: Resultant Agglomerations Based on Different Agglomeration Algorithms 
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Fig. 21 shows the difference between the two algorithms in the case of a cosine Ni Bump with 
9600 elements and 4961 vertices. The underlying geometry is generated from structured data and 
the geometry based algorithm nicely recovers the natural coarse grid. 


4.3.2 Coarse Space Basis Functions 

The transfer operators may now be defined based on the constructed macroelements. The GCA 
formulation is in effect so it is sufficient to construct the prolongation operator only. We would like 
the basis functions to at least satisfy the stability (71) and approximation (72) conditions, preserve 
the constant function, and behave like standard interpolants i.e 




1 if i = j 

0 if i ^ j 


(74) 


The construction of the proposed basis functions makes use of topology and reduced geometry 
if provided. If the geometry is not given, then the elements are assumed to be isotropic. We now 
define the basis functions using graph distance interpolation on both the boundary and interior. If 
the geometry is available, this is used in combination to form a more accurate interpolant.. This is 
an improvement over the interpolation proposed by Chan et al which makes use of graph distance 
interpolation on the boundary and constant interpolation over the interior. This algorithm leads 
to a quasi-linear interpolant as shown in Fig. 22. The detailed algorithm is given below: 



Algorithm 3 (Basis Function Construction) 
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Step 1: For each macroelement, create a local subgraph. In the process, create an ordering of the 
boundary edges such that the boundary can be traversed. 

Step 2: Extract the list of interior vertices. Extract the ordered list of coarse grid vertices by 
traversing the boundary edges. 

Step 3: For all fine grid edge vertices which lie between consecutive coarse grid nodes, construct 
length weighted interpolation data. The macroedge length is also computed simultaneously. 

Step 4: Interior vertex interpolation. For each coarse grid node in the macroelement, a BFS 
iteration on the local subgraph is performed with the coarse grid node as a seed. Both the 
level set as well the distance from the coarse node is recorded for all interior nodes in the 
subgraph during the process. The graph distance of each macroelement fine grid node from 
the macroelement coarse grid nodes is then set. For each fine grid node, these distances are 
then weighted to sum to unity. 

4.3.3 Scaling Issues 

The success of the multigrid methods depends heavily on how good of an approximation the coarse 
space matrices A k are to A. In the GCA formulation, special care must be taken to ensure that 
that these approximations are accurate enough. The construction of the prolongation operator is 
typically not a problem. However, the definition of the restriction operator needs to be modified 
slightly. Let us choose the restriction operator to be 

Rfc = l (75) 

where P£ is the formal adjoint of the prolongation and a is a suitable scaling factor. The scaling of 
R fc is determined by the role of R* . If Rjt is to be used to construct coarse grid representations of <f> k 
(i.e R k (f> k ), then V R A .(i,j) = 1. However, if R is to be used to transfer residuals to the coarse grid, 
then the correct value of the scaling depends on the scaling of the fine grid and coarse grid problems. 
This implies that the coarse grid problem should be consistent with the differential problem in the 
same way as the fine grid problem. This is the basic problem with vertex agglomeration. Let H 
represent a characteristic mesh size on the coarse grid and h represent a characteristic length on 
the fine grid. Finite volume and finite element schemes in 2D lead to a scaling rule which says that 
y~V R fc (j,y) = (y) 2 which can be viewed as a ratio of the area associated with a coarse grid node 
to the area associated with the counterpart fine grid node. 
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The basis function construction algorithm thus needs to be modified to take this into account. 
The associated area for the nodes on both the fine and coarse grids is computed by looping through 
the elements on a grid and sending elemental area contributions (element area divided by the 
number of element vertices) to these vertices. The diagonal scaling matrix a is now computed. 
However, we would like to maintain the GCA formulation, so the system is symmetrized by defining 


p 

= Per* 

(76) 

R 

= cr*P T 

(77) 


= P T 

(78) 


This formulation has the nice feature that the eigenstructure is preserved for an SPD matrix. 
The coarse grid equations are now constructed using the GCA approximation and the multigrid 
operation continues with this new definition for the transfer operators. 

4.4 Implicit Line Smoother 

In the context of our multigrid formulation, we would like to be able to solve the high Reynolds 
number Navier-Stokes equations which is a parabolic system characterized by both advective and 
acoustic modes. We would like to decouple the two modes in the following way. Multigrid methods 
are very effective at damping out elliptic error modes such that the choice for the smoother must 
be that it can handle the advective error components. Following this reasoning, we have opted 
to use an implicit Gaufi-Seidel line relaxation scheme where the lines are constructed to follow 
characteristic directions. The use of a line relaxation scheme leads to a natural splitting of the 
matrix into tridiagonal submatrices which may be solved by any of the myriad tridiagonal matrix 
solvers. The general rule behind the line relaxation is that points which are strongly coupled should 
be updated simultaneously. This leads to the description of how the implicit lines are constructed. 

4.5 Implicit Line Construction 

The implicit line construction process is based on the philosophy of linking strongly coupled nodes. 
In order to reduce the amount of work in the line smoother, minimal overlap between the lines is 
allowed. To properly describe the algorithm, we need to define two terms: 

1. Coupling measure: The coupling measure between two nodes gives a local quantification 
of the connectivity between these nodes. Typically, this is based on the matrix stencil con- 
necting these points. Ideally, the measure of the coupling between the nodes should be based 
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on a discretization of a scalar convection-diffusion equation but other measures such as an 
approximation to the flow velocity or streamlines may be used. This becomes even more 
complicated in the case of block systems of equations such as arise in discretizations of the 
Navier-Stokes equations. It may however be possible to use entropy variables to form an 
approximation of the scalar convection equation. 

2. Coupling degree: The coupling degree between two nodes gives a quantification of the 
connectivity between these nodes as compared to other connected nodes. This is based on 
the coupling measure earlier defined. In the current framework, the degree of coupling between 
two nodes VI and V2 is determined by first computing the coupling measure of the nodes to 
all surrounding nodes and this measure may be freely defined. For each node, the maximum 
value is taken to be the reference value for that node. From the point of view of VI, the 
coupling between nodes VI and V2 will be considered strong if the connectivity measure 
between these nodes is larger than a threshold value. This threshold value is defined to be 
a fraction of the VI reference value. Two nodes are linked up in the line if the degree of 
coupling between them is stonger than any other connection. In advection dominated flows, 
strong coupling tends to be one-sided which is why a two way consideration (i.e from both 
points of view of VI and V2) is necessary. 

Line construction is done in a two pass process. The first pass involves the construction of 
individual lines while the second pass involves merging lines to reduce the line count and improve 
convergence. The construction of a single line begins by choosing a seed node. All nodes which 
do not exist in a line are placed in a queue in no particular order and the seed node is chosen 
from this queue. The chosen seed is usually not an extremety of the line based on a straightforward 
implementation of the algorithm. Hence, we need to introduce the concept of forward and backward 
mode line construction. 

4.5.1 Forward Mode Line Construction 

Forward mode line construction involves stepping through a line by starting at a given node and 
simply choosing the next node with the best strong connection which is not in the current line. 
The best connection, however, may be a node which already exists in another line. Hence, in an 
effort to minimize overlap, the next best node which has a strong connection and has not been 
assigned to a line is chosen. If no such node exists, then the originally selected node is chosen. 
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If the number of times a node which has already been assigned to a line is chosen reaches some 
predetermined limit, then the line is terminated and the overlap nodes are removed. Also, if the 
next chosen node is an extremety of another existing line, then the current line is simply merged 
with that line. Flows with recirculation regions or circular flows give rise to cyclic couplings, so 
this case is specially handled if the chosen node turns out to be the head of the current line. 

4.5.2 Backward Mode Line Construction 

Backward mode line construction is akin to the reverse of the forward mode. For a given node, all 
the adjoining nodes are scanned and a single step of the forward mode algorithm is performed on 
these connected nodes. If any of these connected nodes would have chosen the starting node as the 
next node in the line, then this node is chosen as the next node. In the event that multiple nodes 
would have chosen this node, then the one with the strongest coupling is chosen. As in the forward 
mode, if the chosen node is an extremety of an existing line, then the current line is merged with 
that line. 

4.5.3 Line Processing 

The line is constructed by performing backward and forward mode construction from the seed 
point. After the two halves of the line have been constructed, they are merged together into a 
single line. The post-processing pass is performed once all the nodes have been assigned to reduce 
the line count. This is done by checking the extremeties of every line and testing to see if the 
node on the extremety has a strong connection to a node on the extremety of another line. The 
connection threshold for each node is reduced by a factor (typically between 0.5 and 0.75) to allow 
more lines to be considered. 

Fig. 23 shows a 2-level example of the implicit line construction on the grids. The agglomeration 
shown is the geometry based algorithm. 
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(c) Level 1 Agglomeration 


(d) Level 1 Implicit Lines 


Figure 23: Multilevel Agglomeration and Implicit Lines 
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4.6 Results 


The performance of the algorithm for different flow regimes and characteristics is now presented. 
As mentioned by Chan ([31]), the performance of many numerical techniques for elliptic problems 
are not robust for convection dominated flows. Two particular kinds of flows which are known to 
cause divergence for many techniques are boundary layer flows and recirculation flows. 


4.6.1 Boundary Layer Flow 


We consider the linear convection diffusion equation (Eq. 45) over a square domain ft =]0, 1[ 2 and 
prescribed velocity field U = (-y,x). The forcing function / is set to zero and the dirichlet boundary 
on the inflow and left wall (x = 0) is 


oto = < 


This particular set of conditions is chosen to simulate a boundary layer flow with the nominal 
Reynolds number 
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(79) 


The discretized domain is adapted on the dirichlet boundary to capture the boundary layer as 
shown in Fig. 23(a). All presented results are based on a V(l,l) multigrid cycle with no FMG. 
The agglomeration technique is the geometry based algorithm and the solver is terminated when 
the RMS absolute error in the residual is less than 10 -13 . The relaxation factor u) chosen for all 
the test cases was 0.95. The behaviour of the algorithms for a variety of parameter states is now 
presented: 


Multigrid Level Dependency: 

The dependence of the convergence rate on the number of coarse spaces is shown in Fig. 24. The 
fine mesh has 60399 vertices and 119714 elements and a total of 6 coarse spaces were constructed. 
In the asymptotic limit, the convergence rate is the same for all the curves and beyond the two-grid 
case, the curves fall unto the same line. 
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Figure 24: Multigrid Level Dependency for Boundary Layer Problem: 60399 points; Re = 1.0e6 
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Figure 25: Reynolds Number Dependency for Boundary Layer Problem: # of points = 3849 
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Figure 26: Reynolds Number Dependency for Boundary Layer Problem: # of points = 15763 
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Figure 27: Reynolds Number Dependency for Boundary Layer Problem: # of points = 60399 
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Reynold Number Dependency: 

The dependence of the convergence rate on the Reynolds number is shown in Fig. 25, Fig. 26 
and Fig. 27 for a range of Reynolds numbers from 10 2 to 10'. Figures 25 to 27 were generated 
on a sequence of fine meshes with 3849, 15763 and 60399 vertices respectively which represents an 
approximate halving of the mesh spacing. In all cases, we find a similar asymptotic convergence 
rate. Even more important is the fact that the algorithm works well for such a wide range of 
Reynolds numbers while maintaining a fairly constant bound on the number of iterations required 
for convergence. 

A noticeable trend can be observed with the Re = 100 case, which is the increasingly pronounced 
stall in the residual after a few iterations followed by convergence. The problem is fairly elliptic 
such that well defined characteristic lines are not easily identifiable. As a result, the line solver 
does not facilitate proper propagation of information in the characteristic directions. For such a 
low Re problem, the use of a point implicit Gaufi-Seidel smoother would be a better choice. 
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(a) # of points = 60399 (b) # of coarse grids = 4 

Figure 28: Timing Information: Re=1.0e6 

Fig. 28(a) shows timing information for the algorithm on the boundary layer problem. The 
Reynolds number is 1 million and the fine mesh has 60399 points. The plot shows the CPU time 
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required for the solution versus the number of multigrid levels with and without the preprocessing 
time included. This preprocessing time includes the time required to construct the coarse spaces 
and implicit lines as well as factorization of the coarsest grid matrix. Fig. 28(b) shows the CPU 
time required for the solution versus the number of fine grid vertices for the three meshes mentioned 
above with the number of coarse meshes fixed at 4. 

4.6.2 Recirculation Flow 

We also consider the linear convection diffusion equation (Eq. 45) over a square domain 0 =] — 1, 1[ 2 
(Fig. 29) and prescribed velocity field U = (-y,x). Neumann boundary conditions |^ = 0 are 
imposed on the domain boundaries and the flow field is initialized with random values ranging 
from -1 to 1 with a mean of zero. The nominal Reynolds number is also defined as 

D U h • l h 

Re — 

v 

1 

v 

The discretized fine mesh has 1990 points and an example of a solution converged to an RMS 
residual tolerance of 10 _13 is shown in Fig. 30 for a Reynolds number of 1 million. The range of 4> 
shown is between —1.17 x 10“' and 7.94 x 10 -9 . 

The results in Fig. 31 show the dependence of the convergence rate for a number of Reynolds 
numbers. There is significant deviation of the convergence rate as the Reynolds number increases. 
It can be seen that for the higher Reynolds number cases, the algorithm has increasing trouble in 
damping out certain modes. This is due to the nature of the implicit lines in the context of an 
unstructured grid. Due to the unstructured nature of the grid, these lines do not wrap around 
on themselves (Fig. 32) and as such, the system of equations for the lines are not periodic. This 
means that information cannot be propagated properly along the characteristic lines with resulting 
deterioration in convergence rate. However, there is a reduction of six orders of magnitude in the 
residual before this effect becomes noticeable, so it is still possible to use the algorithm in the role 
of a preconditioner to Krylov subspace solvers. 
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Figure 30: Circular Convection Solution Contours: # of points = 1990, Re=1.0e6 
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Figure 32: Fine mesh implicit lines 
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